Project Details


Cellecta DriverMap AIR

   The DriverMap Adaptive Immune Receptor (AIR) Repertoire Profiling Service from Cellecta provides a profile of all TCR and BCR CDR3 or full-length variable regions in blood, cell, or RNA samples. With the DriverMap AIR TCR-BCR Profiling Service, you get a larger complement of clonotypes than other similar assays; reproducible and comprehensive coverage from a range of immune sample inputs, including total RNA from whole blood; and a rapid, 1-month turnaround from sample submission to an extensive analysis report.

    Since T and B cells work synergistically in the adaptive immune response, Cellecta has designed an assay that profiles both T-cell receptor (TCR) and B-cell receptor (BCR) repertoires in a single convenient reaction. Separate assays specific for T- or B-cell chains are also available. The DriverMap AIR-RNA assay quantifies T-cell and B-cell receptor transcripts. It is designed to specifically amplify only functional human or mouse TCR and BCR RNA molecules, avoiding non-functional pseudogenes with similar structures. Amplifying CDR3 or full-length variable regions from human RNA molecules enables highly sensitive detection of low-frequency, rare TCR and BCR clonotypes and more comprehensive profiling when working with small samples and limited numbers of cells. The DriverMap AIR-DNA assay amplifies receptor genes directly from genomic DNA. The AIR-DNA assay provides a more quantitative measurement of the genetic copies of each CDR3-specific clonotype, which correlates with the number of cells carrying that clonotype in the sample. These data enable the measurement of clonal expansion in T and B cells. Combining data from the AIR-DNA and AIR-RNA assays enables assessment of both the transcriptional activation of a particular clonotype and the number of cells carrying it. The ability to differentiate these two effects provides a quantitative basis for assessing antigen-activated clonotypes.

    Applications of BCR sequencing: identify broadly neutralizing antibodies (bNAbs) and map Ig-seq datasets to known antibody structures for antibody and vaccine development; track B-cell migration and development patterns; find markers of autoimmune diseases (such as multiple sclerosis and rheumatoid arthritis) and cancers (e.g., B-cell lymphoma); and contrast naive and antigenically challenged datasets to understand antibody maturation.

    Applications of TCR sequencing: track T-cell clonality and diversity for insights into the mechanisms of action of immune checkpoint inhibitors for immunotherapies; assess TCR overlap between repertoires to define the spatial and temporal heterogeneity of the anti-tumoral immune response; and analyze TCR sequence and structure to annotate antigenic specificity for developing personalized cellular immunotherapies.

How is the DriverMap AIR Assay Different from other Adaptive Immune Receptor Repertoire (AIRR) Assays?

   DriverMap™ Multiplex PCR technology uses gene-specific primers that significantly reduce non-specific binding and primer-dimer amplification products and are designed to target only TCR/BCR isoforms. Unique Molecular Identifiers (UMIs) enable accurate quantitation of the copy number of cDNA or DNA molecules through the amplification steps, detection of low-abundance clonotypes, and correction of amplification biases and sequencing errors. A dual-index amplicon labeling strategy minimizes index hopping during NGS, allowing for comprehensive readouts. Full profiles of the antigen-recognition CDR3 region enable assessment of CDR3 length distribution, V(D)J segment usage, isotype composition for BCRs, somatic mutations, and similar characteristics with immune receptor profiling software such as MiXCR (MiLaboratories).
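   To illustrate why UMIs improve quantitation, here is a minimal Python sketch that counts distinct UMIs per clonotype instead of raw reads. It assumes exact UMI matching with no error correction (real pipelines such as MiXCR also correct UMI sequencing errors), and the function name, CDR3 sequences, and UMIs are hypothetical toy values.

```python
from collections import defaultdict


def count_molecules(reads):
    """Summarize reads per clonotype.

    reads: iterable of (clonotype, umi) pairs, one pair per sequenced read.
    Returns {clonotype: (read_count, unique_umi_count)}.
    Assumption: UMIs are compared by exact match (no error correction).
    """
    reads_per_clone = defaultdict(int)
    umis_per_clone = defaultdict(set)
    for clonotype, umi in reads:
        reads_per_clone[clonotype] += 1
        umis_per_clone[clonotype].add(umi)
    return {c: (reads_per_clone[c], len(umis_per_clone[c])) for c in reads_per_clone}


# Toy data (hypothetical): four reads of one clonotype come from only two
# original molecules, so PCR duplicates inflate the read count.
toy_reads = [
    ("CASSLGQETQYF", "AACGT"),
    ("CASSLGQETQYF", "AACGT"),
    ("CASSLGQETQYF", "AACGT"),
    ("CASSLGQETQYF", "GGTCA"),
    ("CAVRDSNYQLIW", "TTAGC"),
]
molecule_counts = count_molecules(toy_reads)
```

   The unique-UMI count, not the read count, is the estimate of the number of starting molecules for each clonotype.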

The DriverMap Adaptive Immune Repertoire (AIR) profiling assay workflow is as follows:

Bioinformatics Workflow

   The protocol in this section describes how to extract T- and B-cell receptor repertoires from NGS data generated with the Cellecta DriverMap AIR kit. MiXCR is used to analyze the NGS data, extract clonotypes, and produce plots and tabular results. More details can be obtained from the official MiXCR website (MiXCR Cellecta Preset). immunarch is an R package designed to analyze T-cell receptor (TCR) and B-cell receptor (BCR) repertoires. Post-analysis can be done with MiXCR or immunarch. The following bioinformatics workflow is recommended for the DriverMap AIR assay:

Sequencing Quality

QC/Sequencing

Sequencing Statistics

sample         pct.dup  pct.gc   tot.seq  seq.length
10_113046_R1     97.46      53  13029992         148
10_113046_R2     91.57      52  13029992         148
11_113047_R1     97.58      55   6216885         148
11_113047_R2     94.82      54   6216885         148
1_113026_R1      97.42      53   9004834         148
1_113026_R2      95.94      52   9004834         148
2_113030_R1      95.43      56   1416088         148
2_113030_R2      83.55      56   1416088         148
3_113032_R1      97.41      53   5922502         148
3_113032_R2      93.80      52   5922502         148
4_113033_R1      96.56      53   1757522         148
4_113033_R2      93.68      52   1757522         148
5_113034_R1      96.13      55   2811555         148
5_113034_R2      89.55      54   2811555         148
6_113037_R1      97.77      54   9302919         148
6_113037_R2      94.99      53   9302919         148
7_113038_R1      97.70      54  11911843         148
7_113038_R2      95.98      53  11911843         148
8_113042_R1      97.34      53  12495147         148
8_113042_R2      94.51      52  12495147         148
9_113044_R1      97.47      54  13514357         148
9_113044_R2      92.05      53  13514357         148


pct.dup = Sequence Duplication Rate
pct.gc = GC Percentage
tot.seq = Total Number of Reads
seq.length = Sequence Length (nt)
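
As a rough illustration of what a sequence duplication rate measures, the Python sketch below computes the percentage of reads that are repeats of an already seen sequence. This is a naive definition applied to a full list of reads; FastQC estimates its value from a subset of the file, so its numbers need not match this calculation exactly. The function name and toy reads are hypothetical.

```python
def duplication_rate(sequences):
    """Percentage of reads that duplicate an earlier read (naive definition)."""
    total = len(sequences)
    if total == 0:
        return 0.0
    distinct = len(set(sequences))
    return 100.0 * (1 - distinct / total)


# Toy example (hypothetical reads): 4 reads, 2 distinct sequences.
toy_rate = duplication_rate(["ACGT", "ACGT", "ACGT", "TTGA"])
```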

Poor quality samples

sample nb_problems module
2_113030_R1 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
2_113030_R2 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
3_113032_R1 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
3_113032_R2 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
5_113034_R1 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
5_113034_R2 5 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences, Adapter Content
11_113047_R1 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
11_113047_R2 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
4_113033_R1 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
4_113033_R2 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
6_113037_R1 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
6_113037_R2 4 Per base sequence content, Per sequence GC content, Sequence Duplication Levels, Overrepresented sequences
10_113046_R1 3 Per base sequence content, Sequence Duplication Levels, Overrepresented sequences
10_113046_R2 3 Per base sequence content, Per sequence GC content, Sequence Duplication Levels
1_113026_R1 3 Per base sequence content, Sequence Duplication Levels, Overrepresented sequences
1_113026_R2 3 Per base sequence content, Per sequence GC content, Sequence Duplication Levels
7_113038_R1 3 Per base sequence content, Sequence Duplication Levels, Overrepresented sequences
7_113038_R2 3 Per base sequence content, Per sequence GC content, Sequence Duplication Levels
8_113042_R1 3 Per base sequence content, Sequence Duplication Levels, Overrepresented sequences
8_113042_R2 3 Per base sequence content, Per sequence GC content, Sequence Duplication Levels
9_113044_R1 3 Per base sequence content, Sequence Duplication Levels, Overrepresented sequences
9_113044_R2 3 Per base sequence content, Per sequence GC content, Sequence Duplication Levels


nb_problems = Number of criteria that failed
module = List of criteria that failed
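
In the tables of this report, nb_problems matches the number of FastQC modules with a FAIL call (WARN calls are not counted). A minimal Python sketch of that count, using the calls reported for sample 11_113047_R1 in the FastQC summary of this report; the function name is hypothetical.

```python
def failed_modules(calls):
    """Return the FastQC modules whose status is FAIL, in input order."""
    return [module for module, status in calls.items() if status == "FAIL"]


# FastQC calls for 11_113047_R1, copied from the summary table in this report.
calls_11_113047_R1 = {
    "Basic Statistics": "PASS",
    "Per base sequence quality": "PASS",
    "Per tile sequence quality": "PASS",
    "Per sequence quality scores": "PASS",
    "Per base sequence content": "FAIL",
    "Per sequence GC content": "FAIL",
    "Per base N content": "PASS",
    "Sequence Length Distribution": "PASS",
    "Sequence Duplication Levels": "FAIL",
    "Overrepresented sequences": "FAIL",
    "Adapter Content": "WARN",
}

module = failed_modules(calls_11_113047_R1)
nb_problems = len(module)
```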

Summary of FastQC Calls

10_113046_R1 10_113046_R2 11_113047_R1 11_113047_R2 1_113026_R1 1_113026_R2 2_113030_R1 2_113030_R2 3_113032_R1 3_113032_R2 4_113033_R1 4_113033_R2 5_113034_R1 5_113034_R2 6_113037_R1 6_113037_R2 7_113038_R1 7_113038_R2 8_113042_R1 8_113042_R2 9_113044_R1 9_113044_R2
Basic Statistics PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Per base sequence quality PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Per tile sequence quality PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Per sequence quality scores PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Per base sequence content FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL
Per sequence GC content WARN FAIL FAIL FAIL PASS FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL PASS FAIL WARN FAIL PASS FAIL
Per base N content PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Sequence Length Distribution PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS
Sequence Duplication Levels FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL
Overrepresented sequences FAIL WARN FAIL FAIL FAIL WARN FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL FAIL WARN FAIL WARN FAIL WARN
Adapter Content PASS PASS WARN WARN PASS PASS FAIL FAIL FAIL FAIL WARN WARN FAIL FAIL WARN WARN PASS PASS PASS PASS PASS PASS


QC/MiXCR Alignment

MiXCR Alignment Calls

10_113046 11_113047 1_113026 2_113030 3_113032 4_113033 5_113034 6_113037 7_113038 8_113042 9_113044
Successfully aligned reads: OK WARN OK ALERT WARN WARN ALERT OK OK OK OK
Off target (non TCR/IG) reads: OK OK OK OK OK OK OK OK OK OK OK
Reads with no V or J hits: OK WARN OK ALERT WARN WARN ALERT WARN OK OK OK
Reads with no barcode: OK OK OK OK OK OK OK OK OK OK OK
Overlapped paired-end reads: OK OK OK ALERT OK OK WARN OK OK OK OK
Alignments that do not cover VDJRegion: NA NA NA NA NA NA NA NA NA NA NA
Tag groups that do not cover VDJRegion: NA NA NA NA NA NA NA NA NA NA NA
Barcode collisions in clonotype assembly: OK OK OK OK OK OK OK OK OK OK OK
Unassigned alignments in clonotype assembly: OK OK OK OK OK OK OK OK OK OK OK
Reads used in clonotypes: OK WARN OK ALERT WARN WARN ALERT WARN OK OK OK
Alignments dropped due to low sequence quality: OK OK OK OK OK OK OK OK OK OK OK
Alignments clustered in PCR error correction: NA NA NA NA NA NA NA NA NA NA NA
Clonotypes clustered in PCR error correction: NA NA NA NA NA NA NA NA NA NA NA
Clones dropped in post-filtering: OK OK OK OK OK OK OK OK OK OK OK
Alignments dropped in clones post-filtering: OK OK OK OK OK OK OK OK OK OK OK
Reads dropped in tags error correction and filtering: OK OK OK OK OK OK OK OK OK OK OK
UMIs artificial diversity eliminated: ALERT ALERT ALERT ALERT ALERT ALERT ALERT ALERT ALERT ALERT ALERT
Reads dropped in UMI error correction and whitelist: OK OK OK OK OK OK OK OK OK OK OK
Reads dropped in tags filtering: OK OK OK OK OK OK OK OK OK OK OK


MiXCR Alignment Statistics

10_113046 11_113047 1_113026 2_113030 3_113032 4_113033 5_113034 6_113037 7_113038 8_113042 9_113044
Successfully aligned reads: 98.34% 82.76% 92.98% 16.39% 80.22% 84.58% 53.48% 87.38% 95.79% 96.2% 98.23%
Off target (non TCR/IG) reads: 0.22% 1.48% 0.5% 5.94% 1.79% 1.04% 4.4% 0.41% 0.32% 0.22% 0.39%
Reads with no V or J hits: 1.29% 15.72% 6.47% 77.66% 17.95% 14.33% 42.09% 12.17% 3.84% 3.32% 1.25%
Reads with no barcode: 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
Overlapped paired-end reads: 99.03% 95.91% 98.7% 79.32% 94.5% 97.34% 88.44% 96.75% 98.98% 98.79% 99.01%
Alignments that do not cover VDJRegion: NA NA NA NA NA NA NA NA NA NA NA
Tag groups that do not cover VDJRegion: NA NA NA NA NA NA NA NA NA NA NA
Barcode collisions in clonotype assembly: 0.12% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.06% 0.09%
Unassigned alignments in clonotype assembly: 0.91% 0.32% 1.14% 0.16% 0.26% 0.22% 0.26% 0.37% 0.77% 0.33% 0.92%
Reads used in clonotypes: 96.16% 82.21% 91.49% 16.32% 79.69% 83.99% 53.16% 86.58% 93.72% 94.84% 95.99%
Alignments dropped due to low sequence quality: 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
Alignments clustered in PCR error correction: NA NA NA NA NA NA NA NA NA NA NA
Clonotypes clustered in PCR error correction: NA NA NA NA NA NA NA NA NA NA NA
Clones dropped in post-filtering: 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
Alignments dropped in clones post-filtering: 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
Reads dropped in tags error correction and filtering: 0.81% 0.35% 0.47% 0.3% 0.4% 0.47% 0.34% 0.54% 0.46% 0.71% 0.79%
UMIs artificial diversity eliminated: 79.12% 66.63% 68.56% 65.91% 65.11% 68.07% 66.51% 59.69% 71.21% 69.5% 78.27%
Reads dropped in UMI error correction and whitelist: 0.02% 0.01% 0.01% 0.01% 0.01% 0.01% 0.0% 0.01% 0.01% 0.01% 0.01%
Reads dropped in tags filtering: 0.79% 0.34% 0.46% 0.3% 0.39% 0.46% 0.34% 0.54% 0.45% 0.7% 0.78%
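
The OK/WARN/ALERT calls for "Successfully aligned reads" can be related to the percentages above with two cutoffs. The Python sketch below assumes hypothetical cutoffs of 85% and 70%; they were chosen only because they reproduce the calls in this report and are not taken from the MiXCR documentation.

```python
# Assumed cutoffs (hypothetical; picked to reproduce the calls in this report).
OK_CUTOFF = 85.0
WARN_CUTOFF = 70.0


def alignment_call(pct_aligned):
    """Map a 'Successfully aligned reads' percentage to OK / WARN / ALERT."""
    if pct_aligned >= OK_CUTOFF:
        return "OK"
    if pct_aligned >= WARN_CUTOFF:
        return "WARN"
    return "ALERT"


# 'Successfully aligned reads' percentages from the statistics table above.
aligned_pct = {
    "10_113046": 98.34, "11_113047": 82.76, "1_113026": 92.98,
    "2_113030": 16.39, "3_113032": 80.22, "4_113033": 84.58,
    "5_113034": 53.48, "6_113037": 87.38, "7_113038": 95.79,
    "8_113042": 96.2, "9_113044": 98.23,
}

calls = {sample: alignment_call(pct) for sample, pct in aligned_pct.items()}
flagged = sorted(s for s, c in calls.items() if c != "OK")
```

Here `flagged` lists the samples that merit a closer look before downstream repertoire analysis.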


Alignment Percentages


Clonotype Summary

Chain Usage Summary

fileName                  totalReads  totalClonotypes  clonesWithChain.TRA  clonesWithChain.TRB  clonesWithChain.TRD  clonesWithChain.TRG
10_113046/10_113046.clns    13029992            13455                 6800                 5900                  109                  571
11_113047/11_113047.clns     6216885               61                   26                   34                   NA                    1
1_113026/1_113026.clns       9004834              415                  212                  181                    1                   16
2_113030/2_113030.clns       1416088                4                    3                    1                   NA                   NA
3_113032/3_113032.clns       5922502               62                   31                   26                   NA                    5
4_113033/4_113033.clns       1757522               47                   28                   17                   NA                    2
5_113034/5_113034.clns       2811555               19                   14                    5                   NA                   NA
6_113037/6_113037.clns       9302919              113                   50                   58                   NA                    5
7_113038/7_113038.clns      11911843              967                  442                  489                    4                   16
8_113042/8_113042.clns      12495147             1130                  531                  543                    7                   46
9_113044/9_113044.clns      13514357            15987                 7438                 7963                  100                  324
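
The chain composition of each repertoire can be derived from the table above. The Python sketch below treats NA as zero and uses the sum of the four per-chain counts as the denominator (an assumption, since totalClonotypes does not always equal that sum); the function name is hypothetical and only three samples from the table are shown.

```python
def chain_fractions(chain_counts):
    """Fraction of clonotypes per chain; None (NA in the report) counts as 0."""
    counts = {chain: (n or 0) for chain, n in chain_counts.items()}
    total = sum(counts.values())
    if total == 0:
        return {chain: 0.0 for chain in counts}
    return {chain: n / total for chain, n in counts.items()}


# Per-chain clonotype counts copied from the table above (None = NA).
chain_usage = {
    "10_113046": {"TRA": 6800, "TRB": 5900, "TRD": 109, "TRG": 571},
    "11_113047": {"TRA": 26, "TRB": 34, "TRD": None, "TRG": 1},
    "2_113030": {"TRA": 3, "TRB": 1, "TRD": None, "TRG": None},
}

composition = {s: chain_fractions(c) for s, c in chain_usage.items()}
```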


Total Number of Clonotypes


Chain Composition in the Repertoire


TRAD


TRAD/Samples in Analysis


The following samples were included in this analysis.

10_113046
11_113047
1_113026
2_113030
3_113032
4_113033
5_113034
6_113037
7_113038
8_113042
9_113044


TRAD/Repertoire Statistics


This section shows repertoire statistical measures in each sample.


Unique clonotypes in each sample

Table: The Volume column corresponds to the number of unique clonotypes in each sample.


Total clonotype counts in each sample

Table: The Clones column corresponds to the total clonotype counts in each sample.
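
As a minimal sketch of the two measures above, assume a repertoire is represented as a mapping from clonotype to its count. The CDR3 sequences and counts below are hypothetical toy values, not data from this report.

```python
def volume(repertoire):
    """Number of unique clonotypes (the 'Volume' column)."""
    return len(repertoire)


def clones(repertoire):
    """Total clonotype count, summed over all clonotypes (the 'Clones' column)."""
    return sum(repertoire.values())


# Toy repertoire (hypothetical): clonotype CDR3 sequence -> count.
toy_repertoire = {"CAVRDSNYQLIW": 12, "CAASGGSYIPTF": 3, "CALSEAGTASKLTF": 1}
toy_volume = volume(toy_repertoire)
toy_clones = clones(toy_repertoire)
```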


Clonotype Abundance


CDR3 Region Length

TRAD/Top Clonotypes


Most Abundant Clonotypes



Frequency of Most Abundant Clonotypes



Repertoire Space by Clonotype Index



Most Varying Clonotypes



Frequency of Most Varying Clonotypes



TRAD/Repertoire Overlap


This section shows the overlap of repertoires between samples. The two metrics used are the number of public (shared) clonotypes and the Morisita overlap index.
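
Both overlap measures can be sketched in a few lines of Python. The code below uses the classical Morisita formula, C = 2 * sum(x_i * y_i) / ((D_x + D_y) * X * Y) with D_x = sum(x_i * (x_i - 1)) / (X * (X - 1)); the implementation in immunarch may differ in detail. Function names and toy counts are hypothetical.

```python
def public_overlap(rep_a, rep_b):
    """Number of clonotypes present in both repertoires."""
    return len(set(rep_a) & set(rep_b))


def morisita_index(rep_a, rep_b):
    """Classical Morisita overlap index for two {clonotype: count} mappings."""
    total_a, total_b = sum(rep_a.values()), sum(rep_b.values())
    if total_a < 2 or total_b < 2:
        raise ValueError("each repertoire needs a total count of at least 2")
    d_a = sum(n * (n - 1) for n in rep_a.values()) / (total_a * (total_a - 1))
    d_b = sum(n * (n - 1) for n in rep_b.values()) / (total_b * (total_b - 1))
    if d_a + d_b == 0:
        raise ValueError("index is undefined when every clonotype has count 1")
    shared = sum(rep_a[c] * rep_b[c] for c in set(rep_a) & set(rep_b))
    return 2 * shared / ((d_a + d_b) * total_a * total_b)


# Toy repertoires (hypothetical clonotype labels and counts).
sample_a = {"c1": 10, "c2": 5, "c3": 1}
sample_b = {"c1": 8, "c2": 2, "c4": 6}
n_public = public_overlap(sample_a, sample_b)
morisita = morisita_index(sample_a, sample_b)
```

The public-clonotype count ignores abundance, while the Morisita index weights shared clonotypes by their counts, so the two can rank sample pairs differently.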


Public clonotypes overlap

Table: The values correspond to the number of shared clonotypes between two samples.



TRAD/Gene Usage Statistics


This section quantifies the usage of VDJ genes in the repertoire.


Top 10 Used Genes

The top 10 genes are selected by the mean usage frequency of each gene across all samples.


Gene Usage

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.
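
The selection of top genes by mean usage frequency can be sketched as follows in Python. A gene absent from a sample is assumed to have frequency 0 there, ties are broken alphabetically (an arbitrary choice), and the frequencies are hypothetical toy values.

```python
def top_genes(usage, n=10):
    """Rank genes by mean usage frequency across samples.

    usage: {sample: {gene: frequency}}; a missing gene counts as 0.0.
    Returns the n gene names with the highest mean frequency.
    """
    genes = set().union(*usage.values())
    mean_freq = {
        g: sum(sample.get(g, 0.0) for sample in usage.values()) / len(usage)
        for g in genes
    }
    return sorted(mean_freq, key=lambda g: (-mean_freq[g], g))[:n]


# Toy gene usage (hypothetical frequencies for two samples).
toy_usage = {
    "sample_1": {"TRBV20-1": 0.5, "TRBV5-1": 0.3, "TRBV7-9": 0.2},
    "sample_2": {"TRBV20-1": 0.4, "TRBV7-9": 0.6},
}
top_two = top_genes(toy_usage, n=2)
```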


TRAD/Gene Usage Overlap


This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.
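
A minimal Python sketch of the two metrics, assuming gene usage is given as a mapping from gene to frequency. The Jensen-Shannon divergence here uses base-2 logarithms, so it ranges from 0 (identical usage) to 1 (no genes in common); the correlation is a plain Pearson correlation. immunarch may use different conventions, and all names and values below are hypothetical.

```python
import math


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two {gene: frequency} mappings."""
    sum_p, sum_q = sum(p.values()), sum(q.values())
    jsd = 0.0
    for gene in set(p) | set(q):
        pi = p.get(gene, 0.0) / sum_p  # normalize so frequencies sum to 1
        qi = q.get(gene, 0.0) / sum_q
        mi = (pi + qi) / 2
        if pi > 0:
            jsd += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            jsd += 0.5 * qi * math.log2(qi / mi)
    return jsd


def pearson_correlation(p, q):
    """Pearson correlation of usage frequencies over the union of genes."""
    genes = sorted(set(p) | set(q))
    xs = [p.get(g, 0.0) for g in genes]
    ys = [q.get(g, 0.0) for g in genes]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy gene usage (hypothetical frequencies).
usage_a = {"TRBV20-1": 0.5, "TRBV5-1": 0.3, "TRBV7-9": 0.2}
usage_b = {"TRBV20-1": 0.4, "TRBV5-1": 0.2, "TRBV7-9": 0.4}
divergence = jensen_shannon_divergence(usage_a, usage_b)
correlation = pearson_correlation(usage_a, usage_b)
```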


Gene usage correlation

Table: The values correspond to the correlation metric of the gene usage between two samples.



TRAD/Diversity Metrics


This section reports commonly used metrics of species (i.e., clonotype) diversity. The table that follows each metric contains the relevant information from that metric's calculation.


Chao1 Diversity


D50 Diversity Index


True Diversity


Gini-Simpson Index


Inverse Simpson Index
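
The five metrics listed above can be sketched in Python from a list of clonotype counts. These are textbook definitions under stated assumptions: Chao1 uses the bias-corrected form S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)), D50 is taken as the number of most abundant clonotypes needed to reach 50% of the total count, and true diversity is the Hill number of order q. immunarch may use different variants, and the toy counts are hypothetical.

```python
import math
from collections import Counter


def proportions(counts):
    total = sum(counts)
    return [c / total for c in counts]


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (f1 singletons, f2 doubletons)."""
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def d50(counts):
    """Smallest number of top clonotypes covering at least 50% of all counts."""
    half = sum(counts) / 2
    running = 0
    for rank, c in enumerate(sorted(counts, reverse=True), start=1):
        running += c
        if running >= half:
            return rank
    return 0


def true_diversity(counts, q=1):
    """Hill number of order q; q=1 is the exponential of Shannon entropy."""
    p = proportions(counts)
    if q == 1:
        return math.exp(-sum(x * math.log(x) for x in p if x > 0))
    return sum(x ** q for x in p) ** (1 / (1 - q))


def gini_simpson(counts):
    return 1 - sum(x * x for x in proportions(counts))


def inverse_simpson(counts):
    return 1 / sum(x * x for x in proportions(counts))


# Toy clonotype counts (hypothetical): one dominant clone, three singletons.
toy_counts = [5, 2, 1, 1, 1]
toy_metrics = {
    "chao1": chao1(toy_counts),
    "d50": d50(toy_counts),
    "true_diversity": true_diversity(toy_counts),
    "gini_simpson": gini_simpson(toy_counts),
    "inverse_simpson": inverse_simpson(toy_counts),
}
```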


TRB


TRB/Samples in Analysis


The following samples were included in this analysis.

10_113046
11_113047
1_113026
2_113030
3_113032
4_113033
5_113034
6_113037
7_113038
8_113042
9_113044


TRB/Repertoire Statistics


This section shows repertoire statistical measures in each sample.


Unique clonotypes in each sample

Table: The Volume column corresponds to the number of unique clonotypes in each sample.


Total clonotype counts in each sample

Table: The Clones column corresponds to the total clonotype counts in each sample.


Clonotype Abundance


CDR3 Region Length

TRB/Top Clonotypes


Most Abundant Clonotypes



Frequency of Most Abundant Clonotypes



Repertoire Space by Clonotype Index



Most Varying Clonotypes



Frequency of Most Varying Clonotypes



TRB/Repertoire Overlap


This section shows the overlap of repertoires between samples. The two metrics used are the number of public (shared) clonotypes and the Morisita overlap index.


Public clonotypes overlap

Table: The values correspond to the number of shared clonotypes between two samples.



TRB/Gene Usage Statistics


This section quantifies the usage of VDJ genes in the repertoire.


Top 10 Used Genes

The top 10 genes are selected by the mean usage frequency of each gene across all samples.


Gene Usage

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.


TRB/Gene Usage Overlap


This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.


Gene usage correlation

Table: The values correspond to the correlation metric of the gene usage between two samples.



TRB/Diversity Metrics


This section reports commonly used metrics of species (i.e., clonotype) diversity. The table that follows each metric contains the relevant information from that metric's calculation.


Chao1 Diversity


D50 Diversity Index


True Diversity


Gini-Simpson Index


Inverse Simpson Index


TRG


TRG/Samples in Analysis


The following samples were included in this analysis.

10_113046
11_113047
1_113026
2_113030
3_113032
4_113033
5_113034
6_113037
7_113038
8_113042
9_113044


TRG/Repertoire Statistics


This section shows repertoire statistical measures in each sample.


Unique clonotypes in each sample

Table: The Volume column corresponds to the number of unique clonotypes in each sample.


Total clonotype counts in each sample

Table: The Clones column corresponds to the total clonotype counts in each sample.


Clonotype Abundance


CDR3 Region Length

TRG/Top Clonotypes


Most Abundant Clonotypes



Frequency of Most Abundant Clonotypes



Repertoire Space by Clonotype Index



Most Varying Clonotypes



Frequency of Most Varying Clonotypes



TRG/Repertoire Overlap


This section shows the overlap of repertoires between samples. The two metrics used are the number of public (shared) clonotypes and the Morisita overlap index.


Public clonotypes overlap

Table: The values correspond to the number of shared clonotypes between two samples.



TRG/Gene Usage Statistics


This section quantifies the usage of VDJ genes in the repertoire.


Top 10 Used Genes

The top 10 genes are selected by the mean usage frequency of each gene across all samples.


Gene Usage

Table: The values correspond to the frequency of usage for each gene in each sample. The rows are ordered to show more frequently used genes first.


TRG/Gene Usage Overlap


This section quantifies the similarity of gene usage across the samples. The metrics used are the Jensen-Shannon Divergence, which measures the dissimilarity between samples, and the gene usage correlation.


Gene usage correlation

Table: The values correspond to the correlation metric of the gene usage between two samples.



TRG/Diversity Metrics


This section reports commonly used metrics of species (i.e., clonotype) diversity. The table that follows each metric contains the relevant information from that metric's calculation.


Chao1 Diversity


D50 Diversity Index


True Diversity


Gini-Simpson Index


Inverse Simpson Index


Appendix

How to decompress compressed files

   Compressed files in the *.gz format:

      Unix/Linux/Mac users: use the “gunzip *.gz” command (equivalent to “gzip -d *.gz”)

      Windows users: use decompression software such as WinRAR or 7-Zip

   Compressed files in the *.zip format:

      Unix/Linux/Mac users: use the “unzip” command, one archive at a time (for example, “unzip archive.zip”)

      Windows users: use decompression software such as WinRAR or 7-Zip
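
   As a platform-independent alternative, both formats can be decompressed with the Python standard library. This is a minimal sketch; the function names are hypothetical, and unlike the gunzip command the gz helper keeps the original compressed file.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def gunzip(path):
    """Decompress 'name.ext.gz' to 'name.ext' next to it; keeps the .gz file."""
    src = Path(path)
    dst = src.with_suffix("")  # drops only the trailing '.gz'
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # streams in chunks, safe for large files
    return dst


def unzip(path, out_dir="."):
    """Extract every member of a .zip archive into out_dir."""
    with zipfile.ZipFile(path) as archive:
        archive.extractall(out_dir)
```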

How to open the different data file formats

   *.fastq: sequencing read files in FASTQ format. These files are very large, so they are difficult to open in a regular editor.

      Unix/Linux/Mac users: use the “less” or “more” commands

      Windows users: use an editor such as EditPlus or Notepad++

   *.xls, *.txt, *.tsv: tabular result files; columns are separated by tabs

      Unix/Linux/Mac users: use the “less” or “more” commands

      Windows users: use an editor such as EditPlus or Notepad++, or open the files in Microsoft Excel
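
   The files can also be inspected programmatically. The Python sketch below reads FASTQ records lazily, so a large file never has to be loaded in full, and loads a tab-separated table with the standard library. It assumes four-line FASTQ records (no wrapped sequences); the function names are hypothetical.

```python
import csv
import gzip


def read_fastq(path, limit=None):
    """Yield (read_id, sequence, quality) tuples, optionally only the first `limit`.

    Works on plain or gzip-compressed files; assumes 4 lines per record.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n = 0
        while limit is None or n < limit:
            header = handle.readline().rstrip("\n")
            if not header:  # end of file
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # separator line starting with '+'
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality  # strip the leading '@'
            n += 1


def read_tsv(path):
    """Load a tab-separated table as a list of {column: value} dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

   For example, `list(read_fastq(path, limit=5))` previews the first five reads of a file without reading the rest.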

Software catalog:

   FastQC v0.11.9

   MiXCR v4.5.0

   R v4.3.1

References

   Cock PJA, Fields CJ, Goto N, et al. (2010). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research 38, 1767-1771. (FASTQ)

   Bolotin DA, Poslavsky S, Mitrophanov I, Shugay M, Mamedov IZ, et al. (2015). MiXCR: software for comprehensive adaptive immunity profiling. Nature Methods 12, 380-381. doi:10.1038/nmeth.3364

   Shugay M, Bagaev DV, Turchaninova MA, Bolotin DA, Britanova OV, Putintseva EV, et al. (2015). VDJtools: unifying post-analysis of T cell receptor repertoires. PLoS Computational Biology 11, e1004503.

   Erlich Y, Mitra PP, delaBastide M, et al. (2008). Alta-Cyclic: a self-optimizing base caller for next-generation sequencing. Nature Methods 5(8), 679-682. (sequencing error rate distribution)

   Jiang L, Schlesinger F, Davis CA, et al. (2011). Synthetic spike-in standards for RNA-seq experiments. Genome Research 21(9), 1543-1551. (sequencing error rate distribution)

   König J, Zarnack K, Rot G, et al. (2010). iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nature Structural & Molecular Biology 17(7), 909-915.

   Parekh S, Ziegenhain C, Vieth B, et al. (2016). The impact of amplification on differential expression analyses by RNA-seq. Scientific Reports 6(1), 1-11.

   Fu Y, Wu PH, Beane T, et al. (2018). Elimination of PCR duplicates in RNA-seq and small RNA-seq using unique molecular identifiers. BMC Genomics 19(1), 1-14.

   Kennedy SR, Schmitt MW, Fox EJ, et al. (2014). Detecting ultralow-frequency mutations by Duplex Sequencing. Nature Protocols 9(11), 2586-2606.

   Smith T, Heger A, Sudbery I (2017). UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Research 27(3), 491-499.

R Session Information


#> R version 4.3.2 (2023-10-31)
#> Platform: x86_64-conda-linux-gnu (64-bit)
#> Running under: CentOS Linux 7 (Core)
#> 
#> Matrix products: default
#> BLAS/LAPACK: /home/Cellecta123/miniconda3/envs/AIR/lib/libopenblasp-r0.3.25.so;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] C
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] pheatmap_1.0.12   fastqcr_0.1.3     yaml_2.3.7        htmltools_0.5.7  
#>  [5] lubridate_1.9.3   forcats_1.0.0     purrr_1.0.2       readr_2.1.4      
#>  [9] tidyr_1.3.0       tibble_3.2.1      tidyverse_2.0.0   xfun_0.41        
#> [13] cowplot_1.1.1     gginnards_0.1.2   DT_0.30           stringr_1.5.1    
#> [17] kableExtra_1.3.4  immunarch_0.9.0   patchwork_1.1.3   data.table_1.14.8
#> [21] dtplyr_1.3.1      dplyr_1.1.4       ggplot2_3.4.4     knitr_1.45       
#> [25] rmarkdown_2.25   
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3  rstudioapi_0.15.0   jsonlite_1.8.7     
#>   [4] shape_1.4.6         magrittr_2.0.3      modeltools_0.2-23  
#>   [7] farver_2.1.1        GlobalOptions_0.1.2 vctrs_0.6.4        
#>  [10] rstatix_0.7.2       webshot_0.5.5       broom_1.0.5        
#>  [13] cellranger_1.1.0    sass_0.4.7          bslib_0.6.1        
#>  [16] htmlwidgets_1.6.3   plyr_1.8.9          cachem_1.0.8       
#>  [19] uuid_1.1-1          igraph_1.5.1        mime_0.12          
#>  [22] lifecycle_1.0.4     iterators_1.0.14    pkgconfig_2.0.3    
#>  [25] Matrix_1.6-3        R6_2.5.1            fastmap_1.1.1      
#>  [28] shiny_1.8.0         digest_0.6.33       colorspace_2.1-0   
#>  [31] ggpubr_0.6.0        timechange_0.2.0    fansi_1.0.5        
#>  [34] httr_1.4.7          polyclip_1.10-6     abind_1.4-5        
#>  [37] compiler_4.3.2      withr_2.5.2         doParallel_1.0.17  
#>  [40] backports_1.4.1     carData_3.0-5       viridis_0.6.4      
#>  [43] UpSetR_1.4.0        ggforce_0.4.1       ggsignif_0.6.4     
#>  [46] MASS_7.3-60         tools_4.3.2         ape_5.7-1          
#>  [49] prabclus_2.3-3      httpuv_1.6.12       ggseqlogo_0.1      
#>  [52] nnet_7.3-19         glue_1.6.2          quadprog_1.5-8     
#>  [55] nlme_3.1-164        promises_1.2.1      grid_4.3.2         
#>  [58] stringdist_0.9.12   cluster_2.1.5       reshape2_1.4.4     
#>  [61] generics_0.1.3      gtable_0.3.4        tzdb_0.4.0         
#>  [64] class_7.3-22        hms_1.1.3           tidygraph_1.2.3    
#>  [67] xml2_1.3.5          car_3.1-2           utf8_1.2.4         
#>  [70] flexmix_2.3-19      ggrepel_0.9.4       foreach_1.5.2      
#>  [73] pillar_1.9.0        later_1.3.1         robustbase_0.99-1  
#>  [76] circlize_0.4.15     tweenr_2.0.2        lattice_0.22-5     
#>  [79] tidyselect_1.2.0    gridExtra_2.3       svglite_2.1.2      
#>  [82] stats4_4.3.2        graphlayouts_1.0.2  diptest_0.77-0     
#>  [85] factoextra_1.0.7    DEoptimR_1.1-3      stringi_1.8.2      
#>  [88] evaluate_0.23       codetools_0.2-19    kernlab_0.9-32     
#>  [91] ggraph_2.1.0        cli_3.6.1           shinythemes_1.2.0  
#>  [94] xtable_1.8-4        systemfonts_1.0.5   jquerylib_0.1.4    
#>  [97] munsell_0.5.0       Rcpp_1.0.11         readxl_1.4.3       
#> [100] parallel_4.3.2      ellipsis_0.3.2      mclust_6.0.1       
#> [103] ggalluvial_0.12.5   phangorn_2.11.1     viridisLite_0.4.2  
#> [106] rlist_0.4.6.2       scales_1.3.0        fpc_2.2-10         
#> [109] rlang_1.1.2         fastmatch_1.1-4     rvest_1.0.3